Abstract

Bessiere et al. (AAAI'08) showed that several intractable global constraints can be efficiently 
propagated when certain natural problem parameters are small. In particular, the complete 
propagation of a global constraint is fixed-parameter tractable in k - the number of holes in 
domains - whenever bound consistency can be enforced in polynomial time; this applies to the 
global constraints AtMost-NValue and Extended Global Cardinality (EGC).

In this paper we extend this line of research and introduce the concept of reduction to a 
problem kernel, a key concept of parameterized complexity, to the field of global constraints. 
In particular, we show that the consistency problem for AtMost-NValue constraints admits 
a linear time reduction to an equivalent instance on $O(k^2)$ variables and domain values. This
small kernel can be used to speed up the complete propagation of NValue constraints. We 
contrast this result by showing that the consistency problem for EGC constraints does not 
admit a reduction to a polynomial problem kernel unless the polynomial hierarchy collapses. 

1 Introduction 

Constraint programming (CP) offers a powerful framework for efficient modeling and solving of a 
wide range of hard problems [Rossi et al., 2006]. At the heart of efficient CP solvers are so-called
global constraints that specify patterns that frequently occur in real-world problems. Efficient 
propagation algorithms for global constraints help speed up the solver significantly [van Hoeve and 
Katriel, 2006]. For instance, a frequently occurring pattern is that we require that certain variables 
must all take different values (e.g., activities requiring the same resource must all be assigned dif- 
ferent times). Therefore most constraint solvers provide a global AllDifferent constraint and 
algorithms for its propagation. Unfortunately, for several important global constraints a complete 
propagation is NP-hard, and one switches therefore to incomplete propagation such as bound consis- 
tency [Bessiere et al., 2004]. In their AAAI'08 paper, Bessiere et al. [2008] showed that a complete 
propagation of several intractable constraints can efficiently be done as long as certain natural 
problem parameters are small, i.e., the propagation is fixed-parameter tractable [Downey and
Fellows, 1999]. Among others, they showed fixed-parameter tractability of the AtMost-NValue and
Extended Global Cardinality (EGC) constraints parameterized by the number of "holes" in 
the domains of the variables. If there are no holes, then all domains are intervals and complete 
propagation is polynomial by classical results; thus the number of holes provides a way of scaling 
up the nice properties of constraints with interval domains.

In this paper we bring this approach a significant step forward, picking up a long-term research
objective suggested by Bessiere et al. [2008] in their concluding remarks: whether intractable global 
constraints admit a reduction to a problem kernel or kernelization. 

Kernelization is an important algorithmic technique that has become the subject of a very 
active field in state-of-the-art combinatorial optimization (see, e.g., the references in [Fellows, 2006; 
Guo and Niedermeier, 2007; Rosamond, 2010]). Kernelization can be seen as a preprocessing with 
performance guarantee that reduces a problem instance in polynomial time to an equivalent instance, 
the kernel, whose size is a function of the parameter [Fellows, 2006; Guo and Niedermeier, 2007; 
Fomin, 2010]. 

Once a kernel is obtained, the time required to solve the instance is a function of the parameter 
only and therefore independent of the input size. Consequently one aims at kernels that are as 
small as possible; the kernel size provides a performance guarantee for the preprocessing. Some 
NP-hard combinatorial problems such as k-Vertex Cover admit polynomially sized kernels; for
others such as k-Path an exponential kernel is the best one can hope for [Bodlaender et al., 2009a].

Kernelization fits perfectly into the context of CP where preprocessing and data reduction 
(e.g., in terms of local consistency algorithms, propagation, and domain filtering) are key methods 
[Bessiere, 2006; van Hoeve and Katriel, 2006]. 

Results Do the global constraints AtMost-NValue and EGC admit polynomial kernels? We 
show that the answer is "yes" for the former and "no" for the latter. 

More specifically, we present a linear time preprocessing algorithm that reduces an AtMost-NValue
constraint $C$ with $k$ holes to a consistency-equivalent AtMost-NValue constraint $C'$
of size polynomial in $k$. In fact, $C'$ has at most $O(k^2)$ variables and $O(k^2)$ domain values. We
also give an improved branching algorithm checking the consistency of $C'$ in time $O(1.6181^k)$.
The combination of kernelization and branching yields efficient algorithms for the consistency and
propagation of (AtMost-)NValue constraints.

On the other hand, we show that a similar result is unlikely for the EGC constraint: One
cannot reduce an EGC constraint $C$ with $k$ holes in polynomial time to a consistency-equivalent
EGC constraint $C'$ of size polynomial in $k$. This result is subject to the complexity theoretic
assumption that $\mathrm{NP} \not\subseteq \mathrm{coNP/poly}$, whose failure implies the collapse of the Polynomial Hierarchy
to its third level, which is considered highly unlikely by complexity theorists.

2 Formal Background 

Parameterized Complexity A parameterized problem $P$ is a subset of $\Sigma^* \times \mathbb{N}$ for some finite
alphabet $\Sigma$. For a problem instance $(x, k) \in \Sigma^* \times \mathbb{N}$ we call $x$ the main part and $k$ the parameter. A
parameterized problem $P$ is fixed-parameter tractable (FPT) if a given instance $(x, k)$ can be solved
in time $O(f(k) \cdot p(|x|))$, where $f$ is an arbitrary computable function of $k$ and $p$ is a polynomial in
the input size $|x|$.

Kernels A kernelization for a parameterized problem $P \subseteq \Sigma^* \times \mathbb{N}$ is an algorithm that, given
$(x, k) \in \Sigma^* \times \mathbb{N}$, outputs in time polynomial in $|x| + k$ a pair $(x', k') \in \Sigma^* \times \mathbb{N}$ such that (i) $(x, k) \in P$
if and only if $(x', k') \in P$ and (ii) $|x'| + k' \le g(k)$, where $g$ is an arbitrary computable function. The
function $g$ is referred to as the size of the kernel. If $g$ is a polynomial then we say that $P$ admits a
polynomial kernel.

Global Constraints An instance of the constraint satisfaction problem (CSP) consists of a set 
of variables, each with a finite domain of values, and a set of constraints specifying allowed
combinations of values for some subset of variables. We denote by dom(x) the domain of a variable x and
by scope(C) the subset of variables involved in a constraint C. An instantiation is an assignment $a$
of values to variables such that $a(x) \in \mathrm{dom}(x)$ for each variable $x \in \mathrm{scope}(C)$. A constraint can be
specified extensionally by listing all legal instantiations of its variables or intensionally, by giving 
an expression involving the variables in the constraint scope [Smith, 2006]. Global constraints are 
certain intensionally described constraints involving an arbitrary number of variables [van Hoeve
and Katriel, 2006]. For example, an instantiation is legal for an AllDifferent global constraint 
C if it assigns pairwise different values to the variables in scope(C). 

Consistency A global constraint C is consistent if there is a legal instantiation of its variables. 
The constraint C is hyper arc consistent (HAC) if for each variable $x \in \mathrm{scope}(C)$ and each
value $v \in \mathrm{dom}(x)$, there is a legal instantiation $a$ such that $a(x) = v$ (in that case we say that
C supports v for x). In the literature, HAC is also called domain consistent or generalized arc
consistent. The constraint C is bound consistent if when a variable $x \in \mathrm{scope}(C)$ is assigned the
minimum or maximum value of its domain, there are compatible values between the minimum and 
maximum domain value for all other variables in scope(C). The main algorithmic problems for a 
global constraint C are the following: Consistency, to decide whether C is consistent, and Enforcing 
HAC, to remove from all domains the values that are not supported by the respective variable. 

It is clear that if HAC can be enforced in polynomial time for a constraint C, then the consistency
of C can also be decided in polynomial time (we just need to see if any domain became empty). The 
reverse is true for constraints that satisfy a certain closure property (see [van Hoeve and Katriel, 
2006]), which is the case for most constraints of practical use, and in particular for all constraints 
considered below. The same correspondence holds with respect to fixed-parameter tractability. 
Hence, we will focus mainly on Consistency. 

3 NValue Constraints 

The NValue constraint was introduced by Pachet and Roy [1999]. For a set of variables X and a 
variable N, NValue(X, N) is consistent if there is an assignment $a$ such that exactly $a(N)$ different
values are used for the variables in X. AllDifferent is the special case where $\mathrm{dom}(N) = \{|X|\}$.
Beldiceanu [2001] and Bessiere et al. [2006] decompose NValue constraints into two other global 
constraints: AtMost-NValue and AtLeast-NValue, which require that at most N or at least 
N values are used for the variables in X, respectively. The Consistency problem is NP-complete for 
NValue and AtMost-NValue constraints, and polynomial time solvable for AtLeast-NValue 
constraints. 
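
The three constraints differ only in how the number of distinct values is compared with the value of N. The following Python sketch makes this explicit for a full instantiation; the function names and the dictionary representation of an instantiation are assumptions made for illustration and do not come from the cited papers.

```python
def distinct_values(assignment, xs):
    """Number of different values that `assignment` gives to the variables in xs."""
    return len({assignment[x] for x in xs})


def nvalue_legal(assignment, xs, n_var):
    """NValue: exactly a(N) different values are used on xs."""
    return distinct_values(assignment, xs) == assignment[n_var]


def at_most_nvalue_legal(assignment, xs, n_var):
    """AtMost-NValue: at most a(N) different values are used on xs."""
    return distinct_values(assignment, xs) <= assignment[n_var]


def at_least_nvalue_legal(assignment, xs, n_var):
    """AtLeast-NValue: at least a(N) different values are used on xs."""
    return distinct_values(assignment, xs) >= assignment[n_var]


xs = ["x1", "x2", "x3"]
a = {"x1": 1, "x2": 1, "x3": 4, "N": 2}   # two distinct values, N = 2
b = {"x1": 1, "x2": 1, "x3": 4, "N": 3}   # two distinct values, N = 3
```

In this sketch, `a` is legal for all three constraints, while `b` is legal only for AtMost-NValue. An instantiation is legal for NValue exactly when it is legal for both AtMost-NValue and AtLeast-NValue, which mirrors the decomposition mentioned above.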

For checking the consistency of an AtMost-NValue constraint C, we are given an instance
$\mathcal{I}$ consisting of a set of variables $X = \{x_1, \dots, x_n\}$, a totally ordered set of values $D$, a map
$\mathrm{dom} : X \to 2^D$ assigning a non-empty domain $\mathrm{dom}(x) \subseteq D$ to each variable $x \in X$, and an integer
$N$.$^1$ A hole in a subset $D' \subseteq D$ is a couple $(u, w) \in D' \times D'$ such that there is a $v \in D \setminus D'$ with
$u < v < w$ and there is no $v' \in D'$ with $u < v' < w$. We denote the number of holes in the domain
of a variable $x \in X$ by #holes(x). The parameter of the consistency problem for AtMost-NValue
constraints is $k = \sum_{x \in X} \#\mathrm{holes}(x)$. An interval $I = [v_1, v_2]$ of a variable $x$ is an inclusion-wise
maximal hole-free subset of its domain. Its left endpoint $l(I)$ and right endpoint $r(I)$ are the values
$v_1$ and $v_2$, respectively. Fig. 1 gives an example of an instance and its interval representation. We
assume that instances are given by a succinct description, in which the domain of a variable is given
by the left and right endpoint of each of its intervals. As the number of intervals of the instance
$\mathcal{I} = (X, D, \mathrm{dom}, N)$ is $n + k$, its size is $|\mathcal{I}| = O(n + |D| + k)$. In case dom is given by an extensive
list of the values in the domain of each variable, a succinct representation can be computed in linear
time.

$^1$ If $D$ is not part of the input (or is very large), we may construct $D$ by sorting the set of all endpoints of intervals
in time $O((n + k) \log(n + k))$. Since, w.l.o.g., a solution contains only endpoints of intervals, this step does not
compromise the correctness.

Figure 1: Interval representation of an AtMost-NValue instance $\mathcal{I} = (X, D, \mathrm{dom}, N)$, with
$X = \{x_1, \dots, x_{15}\}$, $N = 6$, $D = \{1, \dots, 14\}$, and $\mathrm{dom}(x_1) = \{1, 2\}$, $\mathrm{dom}(x_2) = \{2, 3, 10\}$, etc.
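
To make the definitions of holes and intervals concrete, the following Python sketch computes the interval representation of a single domain relative to the ordered value set D. The names `intervals` and `num_holes`, and the use of Python sets of integers for domains, are illustrative assumptions rather than part of the paper.

```python
def intervals(dom, D):
    """Return the maximal hole-free runs of `dom` as (left, right) pairs.

    Two values of `dom` belong to the same interval when no value of D
    that is missing from `dom` lies between them.
    """
    pos = {v: i for i, v in enumerate(sorted(D))}   # rank of each value in D
    runs = []
    for v in sorted(dom):
        if runs and pos[v] == pos[runs[-1][1]] + 1:
            runs[-1] = (runs[-1][0], v)   # v directly follows the run in D
        else:
            runs.append((v, v))           # a hole precedes v: start a new run
    return runs


def num_holes(dom, D):
    """#holes(x): one hole separates each pair of consecutive intervals."""
    return len(intervals(dom, D)) - 1


D = set(range(1, 15))                      # D = {1, ..., 14} as in Fig. 1
print(intervals({1, 2}, D))                # dom(x_1)
print(intervals({2, 3, 10}, D))            # dom(x_2)
```

With the domains quoted in the caption of Fig. 1, dom(x_1) = {1, 2} is a single interval with no hole, whereas dom(x_2) = {2, 3, 10} splits into the intervals [2, 3] and [10, 10] and contributes one hole to the parameter k. Note that holes are relative to D: with D = {2, 3, 10} the same domain would be hole-free.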

A greedy algorithm by Beldiceanu [2001] checks the consistency of an AtMost-NValue constraint
in linear time when all domains are intervals (i.e., $k = 0$). Further, Bessiere et al. [2008] have
shown that Consistency (and Enforcing HAC) is FPT, parameterized by the number of holes, for
all constraints for which bound consistency can be enforced in polynomial time. A simple algorithm
for checking the consistency of AtMost-NValue goes over all instances obtained from restricting
the domain of each variable to one of its intervals, and executes the algorithm of [Beldiceanu, 2001]
for each of these at most $2^k$ instances. The running time of this FPT algorithm is clearly bounded by
$O(2^k \cdot |\mathcal{I}|)$.
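
The simple FPT algorithm can be sketched in a few lines of Python. For the interval case we use the textbook right-endpoint greedy for stabbing a set of intervals with as few points as possible; we assume, but do not claim, that this matches the spirit of Beldiceanu's algorithm, and the sorting step means the sketch is not linear time. The function names and the list-of-intervals input format are assumptions made for illustration.

```python
from itertools import product


def greedy_interval_consistent(ivls, n):
    """Interval domains only: can at most n values hit every interval?

    Scans intervals by increasing right endpoint and picks the right
    endpoint of every interval not yet hit by the last picked value.
    """
    chosen = []
    for left, right in sorted(ivls, key=lambda iv: iv[1]):
        if not chosen or chosen[-1] < left:
            chosen.append(right)
    return len(chosen) <= n


def consistent(var_intervals, n):
    """var_intervals holds, per variable, its list of (left, right) intervals.

    Tries every way of restricting each variable to one of its intervals;
    the number of combinations is at most 2^k for k holes in total.
    """
    return any(greedy_interval_consistent(choice, n)
               for choice in product(*var_intervals))


# Three variables with domains {1,2}, {2,3,10} and {9,10}; one hole in total.
instance = [[(1, 2)], [(2, 3), (10, 10)], [(9, 10)]]
print(consistent(instance, 2), consistent(instance, 1))
```

For this small instance two values suffice (for example 2 and 10), while a single value cannot cover both {1, 2} and {9, 10}.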

In the realm of parameterized complexity it is then natural to ask whether AtMost-NValue 
has a polynomial kernel. In the next subsection, we give a linear time kernelization algorithm. We 
then prove its correctness and that the size of the produced instance can be bounded by $O(k^2)$.
In Subsection 3.3, we give an FPT algorithm, which uses the kernelization algorithm, for checking
the consistency of an AtMost-NValue constraint in time $O(1.6181^k k^2 + |\mathcal{I}|)$. HAC can then be
enforced by applying this algorithm $O(|D|)$ times.

3.1 Kernelization Algorithm 

Let $\mathcal{I} = (X, D, \mathrm{dom}, N)$ be an instance for the consistency problem for AtMost-NValue
constraints. The algorithm is more intuitively described using the interval representation of the
instance. The friends of an interval $I$ are the other intervals of $I$'s variable. An interval is optional
if it has at least one friend, and required otherwise. For a value $v \in D$, let ivl(v) denote the set of
intervals containing $v$.

A solution for $\mathcal{I}$ is a subset $S \subseteq D$ of at most $N$ values such that there exists an instantiation
assigning the values in $S$ to the variables in $X$. The algorithm may detect for some value $v \in D$ that,
if the problem has a solution, then it has a solution containing $v$. In this case, the algorithm selects
$v$, i.e., it removes all variables whose domain contains $v$, it removes $v$ from $D$, and it decrements $N$
by one. The algorithm may detect for some value $v \in D$ that, if the problem has a solution, then it
has a solution not containing $v$. In this case, the algorithm discards $v$, i.e., it removes $v$ from every
domain and from $D$. (Note that no new holes are created with respect to $D \setminus \{v\}$.) The algorithm
may detect for some variable $x$ that every solution for $(X \setminus \{x\}, D, \mathrm{dom}|_{X \setminus \{x\}}, N)$ contains a value
from dom(x). In that case, it removes $x$.
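
The select and discard operations are simple enough to state as code. The sketch below represents an instance as a triple (doms, D, n), where doms maps variable names to sets of values; this representation and the function names are assumptions for illustration only.

```python
def select(doms, D, n, v):
    """Commit to value v: drop every variable whose domain contains v,
    remove v from D, and decrement N by one."""
    remaining = {x: d for x, d in doms.items() if v not in d}
    return remaining, D - {v}, n - 1


def discard(doms, D, n, v):
    """Rule out value v: remove it from every domain and from D."""
    return {x: d - {v} for x, d in doms.items()}, D - {v}, n


doms = {"x1": {1, 2}, "x2": {2, 3, 10}, "x3": {9, 10}}
D = set(range(1, 15))
print(select(doms, D, 6, 2)[0])    # only x3 is left
print(discard(doms, D, 6, 1)[0])   # x1 shrinks to {2}
```

Both functions return a new instance and leave their arguments untouched, which keeps the sketch easy to test.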

The algorithm sorts the intervals by increasing right endpoint (ties are broken arbitrarily). Then,
it exhaustively applies the following three reduction rules.

Figure 2: Instance obtained from the instance of Fig. 1 by exhaustively applying rules Red-C,
Red-Dom, and Red-Unit.



Red-C: If there are two intervals $I, I'$ such that $I' \subseteq I$ and $I'$ is required, then remove the variable
of $I$.

Red-Dom: If there are two values $v, v' \in D$ such that $\mathrm{ivl}(v') \subseteq \mathrm{ivl}(v)$, then discard $v'$.

Red-Unit: If $|\mathrm{dom}(x)| = 1$ for some variable $x$, then select the value in dom(x).

In the example from Fig. 1, Red-C removes the variables $x_5$ and $x_8$ because $x_{10} \subseteq x_5$ and $x_7 \subseteq x_8$,
Red-Dom removes the values 1 and 5, Red-Unit selects 2, which deletes variables $x_1$ and $x_2$, and
Red-Dom removes 3 from $D$. The resulting instance is depicted in Fig. 2.
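
As a rough illustration of how the three rules interact, the following Python sketch applies them naively, one rule application at a time, recomputing the interval representation after every change. It is a polynomial-time prototype under our own assumptions (instances as triples (doms, D, n) with integer values, hypothetical function names), not the linear-time implementation referred to in the paper, and it does not include the scan-and-merge phase.

```python
def intervals(dom, D):
    """Maximal runs of `dom` that are contiguous within the ordered set D."""
    pos = {v: i for i, v in enumerate(sorted(D))}
    runs = []
    for v in sorted(dom):
        if runs and pos[v] == pos[runs[-1][1]] + 1:
            runs[-1] = (runs[-1][0], v)   # extend the current run
        else:
            runs.append((v, v))           # a hole precedes v: new run
    return runs


def reduce_once(doms, D, n):
    """Apply the first applicable rule; return the new instance or None."""
    ivls = [(x, l, r) for x, d in doms.items() for l, r in intervals(d, D)]
    n_ivls = {x: len(intervals(d, D)) for x, d in doms.items()}

    # Red-C: I' is required and contained in I, so remove the variable of I.
    for x, l, r in ivls:
        for y, l2, r2 in ivls:
            if y != x and n_ivls[y] == 1 and l <= l2 and r2 <= r:
                return {z: d for z, d in doms.items() if z != x}, D, n

    # Red-Dom: ivl(v2) is a subset of ivl(v), so discard v2.
    def ivl(v):
        return {(x, l, r) for x, l, r in ivls if l <= v <= r}
    for v2 in sorted(D):
        for v in sorted(D):
            if v != v2 and ivl(v2) <= ivl(v):
                return {x: d - {v2} for x, d in doms.items()}, D - {v2}, n

    # Red-Unit: a singleton domain forces its value, so select it.
    for x, d in doms.items():
        if len(d) == 1:
            (v,) = d
            rest = {z: e for z, e in doms.items() if v not in e}
            return rest, D - {v}, n - 1
    return None


def reduce_instance(doms, D, n):
    """Exhaustively apply Red-C, Red-Dom and Red-Unit."""
    while True:
        nxt = reduce_once(doms, D, n)
        if nxt is None:
            return doms, D, n
        doms, D, n = nxt


doms = {"a": {1, 2}, "b": {2, 3, 6}, "c": {3, 4}, "d": {5, 6}, "e": {1, 2, 3, 4}}
print(reduce_instance(doms, set(range(1, 7)), 3))
```

On this toy instance (which is not the instance of Fig. 1) the rules alone solve the problem: they remove e and b by Red-C, discard 1, 3 and 5 by Red-Dom, and select 2, 4 and 6 by Red-Unit, leaving an empty instance with N reduced from 3 to 0.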

After none of the previous rules apply, the algorithm scans the remaining intervals from left 
to right (i.e., by increasing right endpoint). An interval that has already been scanned is either a 
leader or a follower of a subset of leaders. Informally, for a leader L, if a solution contains r(L), 
then there is a solution containing r(L) and the right endpoint of each of its followers. 

The algorithm scans the first intervals up to, and including, the first required interval. All these 
intervals become leaders. 

The algorithm then continues scanning intervals one by one. Let $I$ be the interval that is
currently scanned and $I_p$ be the last interval that was scanned. The active intervals are those that
have already been scanned and intersect $I_p$. A popular leader is a leader that is either active or has
at least one active follower.

• If $I$ is optional, then $I$ becomes a leader, and the algorithm continues scanning intervals until
scanning a required interval; all these intervals become leaders.

• If $I$ is required, then it becomes a follower of all popular leaders that do not intersect $I$ and
that have no follower intersecting $I$. If all popular leaders have at least two followers, then set
$N := N - 1$ and merge the second-last follower of each popular leader with the last follower
of the corresponding leader; i.e., for every popular leader, the right endpoint of its second-last
follower is set to the right endpoint of its last follower, and then the last follower of every
popular leader is removed.

After having scanned all the intervals, the algorithm exhaustively applies the reduction rules
Red-C, Red-Dom, and Red-Unit again.

In the example from Fig. 2, variable xq is merged with Xq, and x? with xiq. Red-Dom then 
removes the values 7 and 8, resulting in the instance depicted in Fig. 3. 

3.2 Correctness and Kernel Size 

Let $\mathcal{I}' = (X', D', \mathrm{dom}', N')$ be the instance resulting from applying one operation of the kernelization
algorithm to an instance $\mathcal{I} = (X, D, \mathrm{dom}, N)$. An operation is an instruction which modifies
the instance: Red-C, Red-Dom, Red-Unit, and merge. We show that there exists a solution $S$
for $\mathcal{I}$ if and only if there exists a solution $S'$ for $\mathcal{I}'$. A solution is nice if each of its elements is the
right endpoint of some interval. Clearly, for every solution, a nice solution of the same size can be
obtained by shifting each value to the next right endpoint of an interval. Thus, when we construct
$S'$ from $S$ (or vice versa), we may assume that $S$ is nice.

Reduction Rule Red-C is sound because a solution for $\mathcal{I}$ is a solution for $\mathcal{I}'$ and vice versa,
because any solution for $\mathcal{I}'$ contains a value $v \in I' \subseteq I$, as $I'$ is required. Reduction Rule Red-Dom
is correct because if $v' \in S$, then $S' := (S \setminus \{v'\}) \cup \{v\}$ is a solution for $\mathcal{I}'$ and for $\mathcal{I}$. Reduction
Rule Red-Unit is obviously correct ($S = S' \cup \mathrm{dom}(x)$).

After having applied these 3 reduction rules, observe that the first interval is optional and 
contains only one value. Suppose the algorithm has started scanning intervals. By construction, 
the following properties apply to $\mathcal{I}'$.

Property 1. A follower does not intersect any of its leaders. 

Property 2. If $I, I'$ are two (distinct) followers of the same leader, then $I$ and $I'$ do not intersect.

Before proving the correctness of the merge operation, let us first show that the subset of leaders 
of a follower is not empty. 

Claim 1. Every interval that has been scanned is either a leader or a follower of at least one leader. 

Proof. First, note that Red-Dom ensures that each domain value in $D$ is the left endpoint of some
interval and the right endpoint of some interval. Let $I$ be the interval that is currently scanned and
$I_p$ be the previously scanned interval. If $I_p$ or $I$ is optional, then $I$ becomes a leader. Suppose $I$ and
$I_p$ are required. We have that $l(I) > l(I_p)$, otherwise $I$ would have been removed by Red-C. By
Rule Red-Dom, there is some interval $I_\ell$ with $r(I_\ell) = l(I_p)$. If $I_\ell$ is a leader, $I$ becomes a follower
of $I_\ell$; otherwise $I$ becomes a follower of $I_\ell$'s leader. □

The following two lemmas prove the correctness of the merge operation. Recall that $\mathcal{I}'$ is an
instance obtained from $\mathcal{I}$ by one application of the merge operation.

Lemma 1. If $S$ is a nice solution for $\mathcal{I}$, then there exists a solution $S'$ for $\mathcal{I}'$ with $S' \subseteq S$.

Proof. Consider the step where the kernelization algorithm applies the merge operation. At that
step, each popular leader has at least two followers and the algorithm merges the last two followers
of each popular leader and decrements $N$ by one. The currently scanned interval is $I$. Let $F_2$
denote the set of all intervals that are the second-last follower of a popular leader, and $F_1$ the set
of all intervals that are the last follower of a popular leader before merging. Let $M$ denote the set
of merged intervals. Clearly, every interval of $F_1 \cup F_2 \cup M$ is required as all followers are required.

Claim 2. Every interval in $F_1$ intersects $l(I)$.

Proof. Let $I_1 \in F_1$. By construction, $r(I_1) \in I$, as $I$ becomes a follower of every popular leader
that has no follower intersecting $I$, and no follower has a right endpoint larger than $r(I)$. Moreover,
$l(I_1) \le l(I)$ as no follower is a strict subset of $I$ by Red-C and the fact that all followers are
required. □

Let $I^-$ be the interval of $F_2$ with the largest right endpoint. Let $L$ be a leader of $I^-$. By construction
and Red-C, $L$ is a leader of $I$ as well and is thus popular. Let $t_1 \in S \cap I$ be the smallest value of
$S$ that intersects $I$ and let $t_2 \in S \cap I^-$ be the largest value of $S$ that intersects $I^-$. By Property 2,
$t_2 < t_1$.

Claim 3. $S$ contains no value $t_0$ such that $t_2 < t_0 < t_1$.

Proof. Suppose $S$ contained such a value $t_0$. As $S$ is nice, $t_0$ is the right endpoint of some interval
$I_0$. As $t_2$ is the rightmost value intersecting $S$ and any interval in $F_2$, $I_0$ is not in $F_2$. As $I_0$ has
already been scanned, and was scanned after every interval in $F_2$, $I_0$ is in $F_1$. However, by Claim
2, $I_0$ intersects $l(I)$. As no scanned interval has a larger right endpoint than $I$, $t_0 \in S \cap I$, which
contradicts the fact that $t_1$ is the smallest value in $S \cap I$ and that $t_0 < t_1$. □

Claim 4. Suppose $I_1 \in F_1$ and $I_2 \in F_2$ are the last and second-last follower of a popular leader $L'$,
respectively. Let $M_{12} \in M$ denote the interval obtained from merging $I_2$ with $I_1$. If $t_2 \in I_2$, then
$t_1 \in M_{12}$.

Proof. For the sake of contradiction, assume $t_2 \in I_2$, but $t_1 \notin M_{12}$. As $t_2 < t_1$, we have that
$t_1 > r(M_{12}) = r(I_1)$. But then $S$ is not a solution as $S \cap I_1 = \emptyset$ by Claim 3 and the fact that
$t_2 < l(I_1)$. □

Claim 5. If $I'$ is an interval with $t_2 \in I'$, then $I' \in F_2 \cup F_1$.

Proof. First, suppose $I'$ is a leader. As every leader has at least two followers when $I$ is scanned,
$I'$ has two followers whose left endpoint is larger than $r(I') \ge t_2$ (by Property 1) and smaller than
$l(I) \le t_1$ (by Red-C). Thus, at least one of them is included in the interval $(t_2, t_1)$ by Property 2,
which contradicts $S$ being a solution by Claim 3.

Similarly, if $I'$ is a follower of a popular leader, but not among the last two followers of any
popular leader, Claim 3 leads to a contradiction as well.

Finally, if $I'$ is a follower, but has no popular leader, then it is to the left of some popular leader,
and thus to the left of $t_2$. □

Consider the set $T_2$ of intervals that intersect $t_2$. By Claim 5, $T_2 \subseteq F_2 \cup F_1$. For every interval
$I' \in T_2 \cap F_2$, the corresponding merged interval of $I'$ intersects $t_1$ by Claim 4. For every interval
$I' \in T_2 \cap F_1$, and every interval $I'' \in F_2$ with which $I'$ is merged, $S$ contains some value $x \in I''$
with $x < t_2$. Thus, $S' := S \setminus \{t_2\}$ is a solution for $\mathcal{I}'$. □

Lemma 2. If $S'$ is a nice solution for $\mathcal{I}'$, then there exists a solution $S$ for $\mathcal{I}$ with $S' \subseteq S$.

Proof. As in the previous proof, consider the step where the kernelization algorithm applies the 
merge operation. The currently scanned interval is $I$. Let $F_2$ and $F_1$ denote the set of all intervals
that are the second-last and last follower of a popular leader before merging, respectively. Let M 
denote the set of merged intervals. 

By Claim 2 from the previous proof, every interval of $M$ intersects $l(I)$. On the other hand,
every interval of $\mathcal{I}'$ whose right endpoint intersects $I$ is in $M$, by construction. Thus, $S'$ contains
the right endpoint of some interval of $M$. Let $t_1$ denote the smallest such value, and let $I_1$ denote
the interval of $\mathcal{I}$ with $r(I_1) = t_1$ (due to Red-C, there is a unique such interval). Let $I_2$ denote
the interval of $\mathcal{I}$ with the smallest right endpoint such that there is a leader $L$ whose second-last
follower is $I_2$ and whose last follower is $I_1$, and let $t_2 := r(I_2)$.

Claim 6. Let $I_1' \in F_1$ and $I_2' \in F_2$ be two intervals from $\mathcal{I}$ that are merged into one interval $M_{12}'$
of $\mathcal{I}'$. If $t_1 \in M_{12}'$, then $t_2 \in I_2'$.

Proof. Suppose $t_1 \in M_{12}'$ but $t_2 \notin I_2'$. We consider two cases. In the first case, $I_2' \subseteq (t_2, l(I_1'))$. But
then, $I_2'$ would have become a follower of $L$, which contradicts that $I_1$ is the last follower of $L$. In
the second case, r(I 2 ) < t 2 . But then, 1\ is a follower of the same leader as I[, as l(/i) < and 
thus Ii = I[. By definition of I 2 , however, t 2 — r(I 2 ) < r(I 2 ), a contradiction. □ 

By the previous claim, a solution $S$ for $\mathcal{I}$ is obtained from a solution $S'$ for $\mathcal{I}'$ by setting
$S := S' \cup \{t_2\}$. □

After having scanned all the intervals, Reduction Rules Red-C, Red-Dom, and Red-Unit are 
applied again, and we have already proved their correctness. 

Thus, the kernelization algorithm returns an equivalent instance. To bound the kernel size 
by a polynomial in $k$, let $\mathcal{I}^* = (X^*, D^*, \mathrm{dom}^*, N^*)$ be the instance resulting from applying the
kernelization algorithm to an instance $\mathcal{I} = (X, D, \mathrm{dom}, N)$.

Property 3. $\mathcal{I}$ and $\mathcal{I}^*$ have at most $2k$ optional intervals.

Property 3 holds for $\mathcal{I}$ as every optional interval is adjacent to at least one hole and each hole
is adjacent to two optional intervals. It holds for $\mathcal{I}^*$ as the kernelization algorithm introduces no
holes.

Lemma 3. $\mathcal{I}^*$ has at most $4k$ leaders.

Proof. Consider the unique step of the algorithm that creates leaders. An optional interval is 
scanned, the algorithm continues scanning intervals until scanning a required interval, and all these 
scanned intervals become leaders. As every interval is scanned only once, for every optional interval, 
there are at most 2 leaders. By Property 3, the number of leaders is thus at most $4k$. □

Lemma 4. Every leader has at most $4k$ followers.

Proof. Consider all steps where a newly scanned interval becomes a follower, but is not merged with
another interval. In each of these steps, the popular leader $L_r$ with the rightmost right endpoint
either

(a) has no follower and intersects $I$, or

(b) has no follower and does not intersect $I$, or

(c) has one follower and intersects $I$.

Now, let $L$ be some leader and let us consider a period where no optional interval is scanned. Let us
bound the number of intervals that become followers of $L$ during this period without being merged
with another interval. If the number of followers of $L$ increases in Situation (a), it does not increase
in Situation (a) again during this period, as no other follower of $L$ may intersect $I$. After Situation
(b) occurs, Situation (b) does not occur again during this period, as $I$ becomes a follower of $L_r$.
Moreover, the number of followers of $L$ does not increase during this period in Situation (c) after
Situation (b) has occurred, as no other follower of $L$ may intersect $I$. After Situation (c) occurs,
the number of followers of $L$ does not increase in Situation (c) again during this period, as no other
follower of $L$ may intersect $I$. Thus, at most 2 followers are added to $L$ in each period. As the first
scanned interval is optional, Property 3 bounds the number of periods by $2k$. Thus, $L$ has at most
$4k$ followers. □



As, by Claim 1, every interval of $\mathcal{I}^*$ is either a leader or a follower of at least one leader, Lemmas
3 and 4 imply that $\mathcal{I}^*$ has $O(k^2)$ intervals, and thus $|X^*| = O(k^2)$. Because of Reduction Rule
Red-Dom, every value in $D^*$ is the right endpoint and the left endpoint of some interval, and thus,
$|D^*| = O(k^2)$.

Using a counting sort algorithm with satellite data (see, e.g., [Cormen et al., 2009]), the initial
sorting of the $n + k$ intervals can be done in time $O(n + |D| + k)$. To facilitate the application
of Red-C, counting sort is actually used twice to also sort by increasing left endpoint the sets of 
intervals with coinciding right endpoint. An optimized implementation applies Red-C, Red-Dom 
and Red-Unit simultaneously in one pass through the intervals, as one rule might trigger the 
other. To guarantee a linear running time for the scan-and-merge phase of the algorithm, only the 
first follower of a leader stores a pointer to the leader; all other followers store a pointer to the 
previous follower. Due to space limitations, we omit the formal details about the implementation 
and running time analysis of the kernelization algorithm. We arrive at our main theorem. 

Theorem 1. The Consistency problem for AtMost-NValue constraints, parameterized by the
number $k$ of holes, admits a linear time reduction to a problem kernel with $O(k^2)$ variables and
$O(k^2)$ domain values.

Using the succinct description of the domains, the size of the kernel can be bounded by $O(k^2)$.

Remark: Denoting $\mathrm{var}(v) = \{x \in X : v \in \mathrm{dom}(x)\}$, Rule Red-Dom can be generalized to discard
any $v' \in D$ for which there exists a $v \in D$ such that $\mathrm{var}(v') \subseteq \mathrm{var}(v)$ at the expense of a higher
running time.

3.3 Improved FPT Algorithm and HAC 

Using the kernel from Theorem 1 and the simple algorithm described in the beginning of this section,
one arrives at an $O(2^k k^2 + |\mathcal{I}|)$ time algorithm for checking the consistency of an AtMost-NValue
constraint. Borrowing ideas from the kernelization algorithm, we now reduce the exponential
dependency on $k$ in the running time. The speed-ups due to this branching algorithm and the kernelization
algorithm lead to a speed-up for enforcing HAC for AtMost-NValue constraints (by Corollary 1)
and for enforcing HAC for NValue constraints (by the decomposition of [Bessiere et al., 2006]).

Theorem 2. The Consistency problem for AtMost-NValue constraints admits an $O(\rho^k k^2 + |\mathcal{I}|)$
time algorithm, where $k$ is the number of holes in the domains of the input instance $\mathcal{I}$, and
$\rho = \frac{1+\sqrt{5}}{2} < 1.6181$.

Proof. The first step of the algorithm invokes the kernelization algorithm and obtains an equivalent
instance $\mathcal{I}'$ with $O(k^2)$ intervals in time $O(|\mathcal{I}|)$.

Now, we describe a branching algorithm checking the consistency of $\mathcal{I}'$. Let $I_1$ denote the first
interval of $\mathcal{I}'$ (in the ordering by increasing right endpoint). $I_1$ is optional. Let $\mathcal{I}_1$ denote the
instance obtained from $\mathcal{I}'$ by selecting $r(I_1)$ and exhaustively applying Reduction Rules Red-Dom
and Red-Unit. Let $\mathcal{I}_2$ denote the instance obtained from $\mathcal{I}'$ by removing $I_1$ (if $I_1$ had exactly one
friend, this friend becomes required) and exhaustively applying Reduction Rules Red-Dom and
Red-Unit. Clearly, $\mathcal{I}'$ is consistent if and only if $\mathcal{I}_1$ or $\mathcal{I}_2$ is consistent.

Note that both $\mathcal{I}_1$ and $\mathcal{I}_2$ have at most $k - 1$ holes. If either $\mathcal{I}_1$ or $\mathcal{I}_2$ has at most $k - 2$ holes,
the algorithm recursively checks whether at least one of $\mathcal{I}_1$ and $\mathcal{I}_2$ is consistent. If both $\mathcal{I}_1$ and $\mathcal{I}_2$
have exactly $k - 1$ holes, we note that in $\mathcal{I}'$,

(1) $I_1$ has one friend,

(2) no other optional interval intersects $I_1$, and

(3) the first interval of both $\mathcal{I}_1$ and $\mathcal{I}_2$ is $I_f$, which is the third optional interval in $\mathcal{I}'$ if the second
optional interval is the friend of $I_1$, and the second optional interval otherwise.

Thus, the instance obtained from $\mathcal{I}_1$ by removing $I_1$'s friend and applying Red-Dom and
Red-Unit may differ from $\mathcal{I}_2$ only in $N$. Let $s_1$ and $s_2$ denote the number of values smaller than
$r(I_f)$ that have been selected to obtain $\mathcal{I}_1$ and $\mathcal{I}_2$ from $\mathcal{I}'$, respectively. If $s_1 \le s_2$, then the
non-consistency of $\mathcal{I}_1$ implies the non-consistency of $\mathcal{I}_2$. Thus, the algorithm need only recursively check
whether $\mathcal{I}_1$ is consistent. On the other hand, if $s_1 > s_2$, then the non-consistency of $\mathcal{I}_2$ implies the
non-consistency of $\mathcal{I}_1$. Thus, the algorithm need only recursively check whether $\mathcal{I}_2$ is consistent.

The recursive calls of the algorithm may be represented by a search tree labeled with the number
of holes of the instance. As the algorithm either branches into only one subproblem with at most
$k - 1$ holes, or two subproblems with at most $k - 1$ and at most $k - 2$ holes, respectively, the number
of leaves of this search tree is $T(k) \le T(k - 1) + T(k - 2)$, with $T(0) = T(1) = 1$. Using standard
techniques in the analysis of exponential time algorithms (see, e.g., [Fomin and Kratsch, 2010]),
and by noticing that the number of operations executed at each node of the search tree is $O(k^2)$,
the running time of the branching algorithm can be upper bounded by $O(\rho^k k^2)$. □
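
The bound on the number of leaves follows by induction from the identity $\rho^{k-1} + \rho^{k-2} = \rho^k$ satisfied by the golden ratio. As a numerical sanity check, the following Python sketch evaluates the worst case of the recurrence (taking it with equality) and compares it with $\rho^k$; the function name `leaves` is our own.

```python
from math import sqrt

RHO = (1 + sqrt(5)) / 2   # golden ratio, about 1.61803


def leaves(k):
    """Worst case of T(k) <= T(k-1) + T(k-2) with T(0) = T(1) = 1."""
    t = [1, 1]
    for i in range(2, k + 1):
        t.append(t[i - 1] + t[i - 2])
    return t[k]


for k in (0, 1, 2, 5, 10, 20):
    print(k, leaves(k), round(RHO ** k, 2))
```

The values of `leaves(k)` are Fibonacci numbers, and for every k they stay below $\rho^k$, in line with the $O(\rho^k k^2)$ bound stated in the proof.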

For the example of Fig. 3, the instances $\mathcal{I}_1$ and $\mathcal{I}_2$ are computed by selecting the value 4, and
removing the interval $x_3$, respectively. The reduction rules select the value 9 for $\mathcal{I}_1$ and the values
6 and 10 for $\mathcal{I}_2$. Both instances start with the interval $x_{11}$, and the algorithm recursively solves $\mathcal{I}_1$
only, where the values 12 and 13 are selected, leading to the solution {4, 9, 12, 13} for the kernelized 
instance, which corresponds to the solution {2, 4, 7, 9, 12, 13} for the instance of Fig. 1. 

Corollary 1. HAC for an AtMost-NValue constraint can be enforced in time
$O(\rho^k \cdot k^2 \cdot |D| + |X| \cdot |D|^2)$, where $k$ is the number of holes in the domains of the input instance
$\mathcal{I} = (X, D, dom, N)$, and $\rho = \frac{1+\sqrt{5}}{2} < 1.6181$.

Proof. We first remark that if a value $v$ can be filtered from the domain of a variable $x$ (i.e., $v$ has no
support for $x$), then $v$ can be filtered from the domain of all variables, as for any legal instantiation
$\alpha$ with $\alpha(x') = v$, $x' \in X \setminus \{x\}$, the assignment obtained from $\alpha$ by setting $\alpha(x) := v$ is a legal
instantiation as well. Also, filtering the value $v$ creates no new holes as the set of values can be set
to $D \setminus \{v\}$.

Now we enforce HAC by applying $O(|D|)$ times the algorithm from Theorem 2. Assume the
instance $\mathcal{I} = (X, D, dom, N)$ is consistent. If $(X, D, dom, N - 1)$ is consistent, then no value can
be filtered. Otherwise, check, for each $v \in D$, whether the instance obtained from selecting $v$ is
consistent and filter $v$ if this is not the case. □
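
The filtering procedure of this proof can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of Theorem 2: the consistency test is replaced by an exponential brute-force search for a set of at most N values that intersects every domain, and "selecting v" is assumed to mean removing all variables whose domain contains v and decreasing N by one. All function names are hypothetical.

```python
from itertools import combinations

def consistent(domains, n):
    """Brute-force stand-in for the consistency test (assumption, not Theorem 2):
    is there a set of at most n values that intersects every domain?"""
    if not domains:
        return n >= 0
    if n <= 0:
        return False
    values = sorted(set().union(*domains))
    for size in range(1, min(n, len(values)) + 1):
        for chosen in combinations(values, size):
            picked = set(chosen)
            if all(picked & d for d in domains):
                return True
    return False

def select(domains, n, v):
    """Assumed meaning of 'selecting v': drop variables that can take v, spend one value."""
    return [d for d in domains if v not in d], n - 1

def enforce_hac(domains, n):
    """Return filtered domains, or None if the instance is inconsistent."""
    if not consistent(domains, n):
        return None
    # If one value of the budget is spare, every value has a support.
    if consistent(domains, n - 1):
        return [set(d) for d in domains]
    values = set().union(*domains)
    # A value without support for one variable has no support for any variable.
    unsupported = {v for v in values if not consistent(*select(domains, n, v))}
    return [set(d) - unsupported for d in domains]
```

For example, with domains {1, 2}, {2, 3}, {5} and N = 2, only the value set {2, 5} works, so the values 1 and 3 are filtered.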

4 Extended Global Cardinality Constraints 

An EGC constraint $C$ is specified by a set of variables $scope(C) = \{x_1, \ldots, x_n\}$ and for each value
$v \in \bigcup_{i=1}^{n} dom(x_i)$ a set $D(v)$ of non-negative integers. The constraint is consistent if each variable
can take a value from its domain such that the number of variables taking a value $v$ belongs to the
set $D(v)$.
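
As an illustration of this definition, the consistency of a small EGC constraint can be checked by exhaustive enumeration. The sketch below takes exponential time and is neither a flow-based method nor a fixed-parameter algorithm; the function name and the dictionary `allowed`, which maps each value v to D(v), are assumptions made for this example.

```python
from collections import Counter
from itertools import product

def egc_consistent(domains, allowed):
    """Brute-force EGC consistency check.

    domains -- list of sets, domains[i] is dom(x_i)
    allowed -- dict mapping each value v to the set D(v) of permitted counts
    """
    values = set().union(*domains) if domains else set()
    # Try every assignment that picks one value per variable.
    for assignment in product(*[sorted(d) for d in domains]):
        counts = Counter(assignment)  # missing values count as 0
        if all(counts[v] in allowed[v] for v in values):
            return True
    return False
```

For instance, with domains {1, 2}, {1, 2}, {2} and D(1) = {0, 2}, D(2) = {1}, assigning 1, 1, 2 satisfies the constraint.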

The Consistency problem for EGC constraints is NP-hard [Quimper et al., 2004]. However,
if all sets $D(\cdot)$ are intervals, then consistency can be checked in polynomial time using network
flows [Regin, 1996]. By the result of Bessiere et al. [2008], the Consistency problem for EGC
constraints is fixed-parameter tractable, parameterized by the number of holes in the sets $D(\cdot)$.
Thus Regin's result generalizes to instances that are close to the interval case. However, it is
unlikely that EGC constraints admit a polynomial kernel.

Theorem 3. The Consistency problem for EGC constraints, parameterized by the number of holes
in the sets $D(\cdot)$, does not admit a polynomial kernel unless $NP \subseteq coNP/poly$.

Proof. We establish the theorem by a combination of results from Bodlaender et al. [2009b],
Fortnow and Santhanam [2008], and Quimper et al. [2004]. We need the following definitions. The
unparameterized version of a parameterized problem $P \subseteq \Sigma^* \times \mathbb{N}$ is
$UP(P) = \{ x\#1^k : (x, k) \in P \} \subseteq (\Sigma \cup \{\#\})^*$, where $1$ is an arbitrary symbol from $\Sigma$ and $\#$ is a
new symbol not in $\Sigma$. Let $P, Q \subseteq \Sigma^* \times \mathbb{N}$ be parameterized problems. We say that $P$ is polynomial
parameter reducible to $Q$ if there is a polynomial time computable function
$f : \Sigma^* \times \mathbb{N} \to \Sigma^* \times \mathbb{N}$ and a polynomial $p$, such that for all $(x, k) \in \Sigma^* \times \mathbb{N}$, we have
$(x, k) \in P$ if and only if $(x', k') = f(x, k) \in Q$, and $k' \le p(k)$.
The three known results are the following.

(1) [Bodlaender et al., 2009b] Let $P$ and $Q$ be parameterized problems such that $UP(P)$ is
NP-complete, $UP(Q)$ is in NP, and $P$ is polynomial parameter reducible to $Q$. If $Q$ has a
polynomial kernel, then $P$ has a polynomial kernel.

(2) [Fortnow and Santhanam, 2008] The problem of deciding the satisfiability of a CNF formula
(SAT), parameterized by the number of variables, does not admit a polynomial kernel, unless
$NP \subseteq coNP/poly$.

(3) [Quimper et al., 2004] Given a CNF formula $F$ on $k$ variables, one can construct in polynomial
time an EGC constraint $C_F$ such that

(i) for each value $v$ of $C_F$, $D(v) = \{0, i_v\}$ for an integer $i_v > 0$,

(ii) $i_v > 1$ for at most $2k$ values $v$, and

(iii) $F$ is satisfiable if and only if $C_F$ is consistent.

Thus, the number of holes in $C_F$ is at most twice the number of variables of $F$.

We observe that (3) is a polynomial parameter reduction from SAT, parameterized by the number 
of variables, to the Consistency problem for EGC constraints, parameterized by the number of 
holes. Hence the theorem follows from (1) and (2). □ 
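
To make the parameter bound in (3) concrete, the sketch below counts holes in cardinality sets of the shape {0, i_v}. It assumes that a hole is a maximal gap between two consecutive members of a set, and the function names and the example sets are hypothetical; the construction of the constraint from a formula is not reproduced here.

```python
def count_holes(values):
    """Number of gaps between consecutive members of a set of integers
    (assumed definition of a hole for this illustration)."""
    ordered = sorted(set(values))
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > 1)

def parameter(cardinality_sets):
    """Total number of holes over all sets D(v), given as a dict v -> D(v)."""
    return sum(count_holes(s) for s in cardinality_sets.values())

# Sets of the form {0, i_v}: a hole appears exactly when i_v > 1.
example = {"a": {0, 1}, "b": {0, 3}, "c": {0, 2}}
```

In this example only the values b and c contribute a hole, so the parameter equals the number of values v with i_v > 1, which (ii) bounds by 2k.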

5 Conclusion 

We have introduced the concept of kernelization to the field of constraint processing, providing both 
positive and negative results for the important global constraints NValue and EGC, respectively. 
On the positive side, we have developed an efficient linear-time kernelization algorithm for the 
consistency problem for AtMost-NValue constraints, and have shown how it can be used to 
speed up the complete propagation of NValue and related constraints. On the negative side, we 
have established a theoretical result which indicates that EGC constraints do not admit polynomial 
kernels. 

Our algorithms are efficient and the theoretical worst-case time bounds do not include large 
hidden constants. We therefore believe that the algorithms are practical, but we must leave an 
empirical evaluation for future research. We hope that our results stimulate further research on 
kernelization algorithms for constraint processing. 